Our Analytics Platform is a Factory of Business Intelligence
SBA Data Source and Methodology
Data Source
The SBA loan data on this site originates from data released by the U.S. Small Business Administration (SBA). The SBA collects individual loan data from the lenders and banks that approve and provide SBA loans, then makes basic loan data public in accordance with Freedom of Information Act (FOIA) requirements. The SBA FOIA data is imported into the SBADNA advanced analytics platform, which generates the reports and rankings shared on LoanBox.
SBA 7(a) Program
All SBA lending data on this site is based on the SBA’s flagship 7(a) program. It does not include loan data for the 504 Program or for PPP loan disbursements or recipients.
Data Integrity
As with most big datasets, the data reported by the U.S. Small Business Administration (SBA) is imperfect. It relies on a bank representative to input the data, on the SBADNA analytics platform to calculate the data accurately, and on LoanBox to post the correct data to the right pages and categories. Sometimes data points will have an erroneous entry or placement, and the percentage of erroneous data varies by lender. Different lenders take different levels of care in the accuracy of the data provided to the SBA. Older data is less reliable than recent data.
Loan Status
Loan approvals and approval amounts reported include all loan statuses. If a loan was approved but then cancelled, we still include it in these counts unless otherwise noted. When a loan approval is cancelled, it is usually because the borrower did not move forward with the loan (for a multitude of reasons) and not because of the lender’s unwillingness to fund a loan that has been approved. Lenders do frequently update their information on loan status, but updated data isn’t available to us until the following quarterly SBA data release(s).
Forecast Model
Forecasting is used to make predictions for specific future time periods. Our forecast model is far from a crystal ball; it’s just an AI-powered best guess. Forecast quality suffers for lenders and franchises with fewer than 10 years of data, or when the loan data category has minimal or erratic data. Our forecast model is based on the Holt-Winters forecasting method, also known as triple exponential smoothing, which tracks a level, a trend, and a seasonal component. The forecast is calculated as a weighted average of all historical data, with recent data weighted exponentially higher than older data. The Holt-Winters method comes in two forms, using either additive or multiplicative components. The AI-powered platform internally calculates results with both and shows the better result of the two.
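To make the description above concrete, here is a minimal sketch of triple exponential smoothing in plain Python. It is an illustration under stated assumptions, not the SBADNA implementation: the function names, the fixed smoothing weights, the simple initialization, and the rule of picking the variant with the lower in-sample squared error are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def holt_winters(series: List[float], season_len: int, horizon: int,
                 alpha: float = 0.5, beta: float = 0.1, gamma: float = 0.1,
                 multiplicative: bool = False) -> Tuple[float, List[float]]:
    """Return (in-sample squared error, forecast for the next `horizon` periods).

    The multiplicative variant assumes strictly positive data.
    """
    m = season_len
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialization (an assumption): level is the mean of the first
    # season, trend is the average per-period change between seasons 1 and 2.
    first, second = series[:m], series[m:2 * m]
    level = sum(first) / m
    trend = (sum(second) - sum(first)) / (m * m)
    if multiplicative:
        seasonals = [x / level for x in first]
    else:
        seasonals = [x - level for x in first]

    sse = 0.0
    for t in range(m, len(series)):
        y, s = series[t], seasonals[t % m]
        # One-step-ahead prediction made before seeing y.
        pred = (level + trend) * s if multiplicative else level + trend + s
        sse += (y - pred) ** 2

        prev_level = level
        if multiplicative:
            level = alpha * (y / s) + (1 - alpha) * (prev_level + trend)
            seasonals[t % m] = gamma * (y / level) + (1 - gamma) * s
        else:
            level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
            seasonals[t % m] = gamma * (y - level) + (1 - gamma) * s
        trend = beta * (level - prev_level) + (1 - beta) * trend

    n = len(series)
    forecast = []
    for h in range(1, horizon + 1):
        s = seasonals[(n + h - 1) % m]
        base = level + h * trend
        forecast.append(base * s if multiplicative else base + s)
    return sse, forecast


def best_forecast(series: List[float], season_len: int,
                  horizon: int) -> Tuple[str, List[float]]:
    """Run both variants and keep the one with the lower in-sample error."""
    add_sse, add_fc = holt_winters(series, season_len, horizon)
    mul_sse, mul_fc = holt_winters(series, season_len, horizon,
                                   multiplicative=True)
    if add_sse <= mul_sse:
        return "additive", add_fc
    return "multiplicative", mul_fc
```

With quarterly loan counts, for example, `best_forecast(counts, season_len=4, horizon=4)` would return the chosen variant and a four-quarter projection. A production system would normally also fit the smoothing weights to the data rather than fixing them.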
Why our data may vary from other sources
When compared on an “apples to apples” basis, all sources will be very close to each other. Usually when there is a discrepancy it is because of:
Time Frame. The SBA bases annual numbers on the SBA fiscal year, which ends September 30th. We report by fiscal year, calendar year, trailing 12 months, and quarter, and also generate reports for the last 5 years, 10 years, 20 years, and all-time.
Duplicates. Other sources may not be accounting for duplicate data that has been merged or purged from our database. We have merged tens of thousands of duplicate loans, franchise brands, and cities, and continue with duplicate purging efforts. When a bank purchases another bank we merge the two bank names into one.
Approved vs. Disbursed. Other sources only use loan “approvals” as their measuring stick. A loan approved does not always equate to a loan disbursed. While approvals are the general variable we use, we often report on actual disbursements (funded loans) as well. Disbursed amounts will always be lower than approved amounts.
Cancelled Loans. If a loan was approved but then cancelled, we still include these counts unless otherwise noted.
Franchise Insiders. Franchisors, franchise-based associations, and analytics firms may report additional data on specific franchise brands, distributors, and sub-franchisees that cannot be captured through SBA data analytics alone.
No Warranty of Data Accuracy or Completeness. LoanBox and SBADNA do not validate or verify the data released by the SBA because it would be impossible to do so. While extreme efforts have been made to accurately report the data we access, we do not make any representation as to the completeness or accuracy of the SBA loan data reported on this website. In fact, due to the current structure and process of how SBA data is collected, managed, and shared by the SBA, we can guarantee that not all counts will be 100% accurate.
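The Time Frame point above is easy to see with a small example. The sketch below assigns approval dates to a fiscal year that ends September 30th and compares the counts with calendar-year counts. The function name and the sample dates are hypothetical, and the convention of labeling a fiscal year by the calendar year in which it ends is the standard federal one.

```python
from collections import Counter
from datetime import date


def sba_fiscal_year(d: date) -> int:
    """Fiscal year N runs from October 1 of year N-1 through September 30 of year N."""
    return d.year + 1 if d.month >= 10 else d.year


# Made-up approval dates, for illustration only.
approvals = [date(2023, 9, 30), date(2023, 10, 1), date(2023, 12, 15)]

by_fiscal = Counter(sba_fiscal_year(d) for d in approvals)
by_calendar = Counter(d.year for d in approvals)
```

Here the same three loans count as one loan in fiscal 2023 and two in fiscal 2024, but as three loans in calendar 2023, which is exactly the kind of gap that appears when two sources use different time frames.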
How the SBADNA analytics platform was built and designed
SBADNA has changed how SBA 7(a) lending intel can be found and harnessed.
Business owner intel previously unavailable now is.
Rich insights previously hidden are now transparent.
Difficult data to mine is now easily found in seconds.
SBADNA source data includes all SBA 7(a) loans (about 1.7 million), historical prime rate data, SBA’s Franchise Directory, and Sector, Subsector, and Industry Group codes. We then cleaned erroneous data, merged duplicates, fixed thousands of typos and misspellings, and merged duplicate franchise brands and those listed with multiple spellings. Cleaning data is an ongoing process, and we recognize this type of dataset can never be 100% “clean” but this is still the goal we are striving for.
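As a rough illustration of the duplicate merging described above, the sketch below groups names that differ only in case, punctuation, or a trailing legal suffix. It is a simplified example, not the SBADNA cleaning process: the suffix list, the function names, and the sample names are hypothetical, and real cleanup also involves typos and acquisitions that a rule this simple would miss.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Hypothetical list of tokens to ignore when comparing names.
IGNORED_TOKENS = {"inc", "llc", "corp", "co", "na", "the"}


def normalize_name(name: str) -> str:
    """Lowercase, drop periods and apostrophes, strip other punctuation and suffixes."""
    text = re.sub(r"[.'’]", "", name.lower())
    tokens = re.sub(r"[^a-z0-9 ]", " ", text).split()
    return " ".join(t for t in tokens if t not in IGNORED_TOKENS)


def group_duplicates(names: List[str]) -> Dict[str, List[str]]:
    """Map each normalized name to its spellings, keeping only real duplicates."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return {key: spellings for key, spellings in groups.items()
            if len(spellings) > 1}
```

For instance, `group_duplicates(["Joe's Pizza", "JOES PIZZA", "Joes Pizza, LLC", "Other Brand"])` groups the first three spellings under the key `joes pizza` and leaves the fourth name out.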
After the data is sourced, additional data points are derived from it, creating an even larger dataset. For example, the average loan size, rate spread, and loan life are not included in the source data but can be determined from the source data. Knowing the NAICS industry code means you can also know that code’s Industry Group, Subsector, and Sector. Having the dates of loan approvals and disbursements allows analytics to be applied to calculate YOY, TTM YOY, trends, and forecasts. We added dozens of these derived data point calculations, greatly increasing the robustness of the datasets.
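A few of these derived data points can be sketched in a handful of lines. The example below is illustrative only. The NAICS roll-up relies on the standard structure of the code (the first two digits give the Sector, three the Subsector, and four the Industry Group). The definition of rate spread as the loan rate minus the prime rate is an assumption for this example, and all function names are hypothetical.

```python
from typing import Dict, Optional


def naics_hierarchy(code: str) -> Dict[str, str]:
    """Roll a 6-digit NAICS code up to its broader groupings by prefix."""
    return {
        "sector": code[:2],
        "subsector": code[:3],
        "industry_group": code[:4],
    }


def average_loan_size(total_amount: float, loan_count: int) -> Optional[float]:
    """Total approved amount divided by number of loans; None if there are no loans."""
    return total_amount / loan_count if loan_count else None


def rate_spread(loan_rate_pct: float, prime_rate_pct: float) -> float:
    """Assumed definition: loan interest rate minus the prime rate, in percentage points."""
    return loan_rate_pct - prime_rate_pct


def yoy_change(current: float, prior: float) -> Optional[float]:
    """Year-over-year change as a fraction; None when the prior period is zero."""
    return (current - prior) / prior if prior else None
```

A TTM YOY figure follows the same pattern as `yoy_change`, with `current` being the sum of the trailing 12 months and `prior` the sum of the 12 months before that. Note that a few NAICS sectors span more than one two-digit prefix (Manufacturing covers 31 through 33, for example), so a real roll-up needs a lookup table on top of the prefix.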
We segmented intel into categories such as Rankings, Activity, Risk, and Trends as well as Lenders, Industries, Franchises, and Geography. Then we created dashboards for each category to make it easy to find and harness the intel.
We then created Top Lender Indexes so any bank can benchmark itself against the average of the top 100, top 50, and top 20 SBA lenders.
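One way such an index could be computed is shown below. This is a sketch under assumptions, since the exact SBADNA definition is not given here: it treats the index as the simple average of a metric across the top N lenders ranked by that same metric, and the function names and figures are hypothetical.

```python
from typing import Dict


def top_lender_index(totals_by_lender: Dict[str, float], n: int) -> float:
    """Average of a metric across the top `n` lenders ranked by that metric."""
    top = sorted(totals_by_lender.values(), reverse=True)[:n]
    if not top:
        raise ValueError("no lenders to average")
    return sum(top) / len(top)


def benchmark(lender_value: float, index_value: float) -> float:
    """A lender's value as a fraction of the index (1.0 means equal to the index)."""
    return lender_value / index_value
```

With made-up totals `{"A": 300.0, "B": 200.0, "C": 100.0, "D": 50.0}`, the top-2 index is 250.0, and lender C benchmarks at 0.4 of that index.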
Time period buttons were created so you can easily and instantly change the reporting for any calendar year, fiscal year, quarter, half year, or customize the date range you want to see.
Interconnected filters and interactive charts and graphs were implemented to make intel faster to find and more interesting to engage with. The interconnected dropdown filters mean that if you open the Project State dropdown filter and select “Texas,” all of the other filters will only show options that apply to project state Texas. The interactive charts and graphs mean that if you hover over a segment, a popup window shows more specifics, and if you click it, the rest of the dashboard changes to show results for what you selected.
To make the Intel Dashboards more user friendly, each user can customize their own dashboard views. This way you can set up filters for the intel most important to you, then save and name the dashboard view. When you return, you can choose among your saved views instead of selecting all the same filters again. You can even set any saved view as the “default view” so when you go to the dashboard in the future it will automatically show the report you most want to see.
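The interconnected filter behavior can be sketched as follows. This is a minimal illustration, not the dashboard's actual code: the function name, the field names, and the sample rows are hypothetical, and loans are represented as plain dictionaries for simplicity.

```python
from typing import Dict, List, Set


def available_options(rows: List[Dict[str, str]],
                      selected: Dict[str, str]) -> Dict[str, Set[str]]:
    """Return, for every unselected field, the values still present
    among the rows that match all current selections."""
    matching = [row for row in rows
                if all(row.get(field) == value
                       for field, value in selected.items())]
    options: Dict[str, Set[str]] = {}
    for row in matching:
        for field, value in row.items():
            if field not in selected:
                options.setdefault(field, set()).add(value)
    return options


# Made-up rows, for illustration only.
loans = [
    {"project_state": "Texas", "lender": "Bank A", "sector": "72"},
    {"project_state": "Texas", "lender": "Bank B", "sector": "44"},
    {"project_state": "Ohio", "lender": "Bank C", "sector": "72"},
]

texas_options = available_options(loans, {"project_state": "Texas"})
```

After selecting Texas, the lender filter would offer only Bank A and Bank B, and Bank C would no longer appear because it has no Texas rows.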
In addition to making intel easy and fast to find, we also made the intel easy to share. Any of the dashboard reports can be saved as a high-quality PNG image. Individual charts, graphs, maps, and lists can also be downloaded as an image or as an Excel or CSV document. This makes any intel easy to share with prospects, clients, and coworkers, or to add to reports and loan presentations.
While the advanced analytics platform was a monumental endeavor to create, the real challenge was making this newly available intel ready for human consumption, that is, easy to find and harness for the masses. The purpose of the Intel Dashboards’ design and functionality is to make it intuitive enough for any non-technical person to utilize without any training required. After clicking around on a dashboard for a moment or two the user “gets it.” Each dashboard is essentially the same: select any filters you want to use, select any time period you want to see, and then interact with the intel.
The final step was taking an analytics platform that is expensive and time consuming to create and maintain and making it affordable for anyone to access. We accomplished this by designing the platform so that it can be accessed through a community cloud website. Just go to the website, log in, and start using it.
That’s the SBADNA platform in a nutshell. It’s easy, fast, intuitive, interactive, powerful, and affordable. Lending intel that was previously unavailable now is. Data that used to take a lot of time and effort to find now doesn’t. The rich insights that were previously hidden in big lending datasets can now be found in seconds by clicking a few buttons.
This article is authored by Darin Manis, founder of LoanBox.